This is slightly old, and eventually needs re-writing.

Evaluating hotspots

For the moment, we shall presume we are only comparing grid-based predictions (though with an eye to possibly very small grid cells arising from continuous predictions).

We first give a very high-level overview of a "pipeline" for producing predictions, scoring them, and comparing different predictions. We then provide more details on each stage.

Prediction pipeline

Certain grid cells are flagged. (E.g. a prediction technique assigns a "risk" to each cell, and the top 5% of cells are flagged.)
The prediction is compared against the actuality. (E.g. for the next day, we count the number of crime events which occurred in flagged cells, and compare this to the total crime count.) This is currently the focus of work.
Other properties of the prediction are computed. (E.g. a measure of "compactness" or "clumpiness" of the flagged areas; broadly we seek to assess how practically useful the prediction is.)
This is repeated for many time periods. (E.g. for each consecutive day over 6 months of data.)
The resulting time series are compared. (E.g. through the use of summary statistics, or using a statistical test.)
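
As a rough illustration of the stages above, here is a minimal sketch in Python using numpy. It is not the project's actual implementation: the function names (`flag_top_cells`, `hit_rate`, `evaluate`), the array layout (one 2D risk grid and one 2D crime-count grid per day), and the arbitrary handling of tied risks are all assumptions made for this example. The score used is the fraction of the day's events which fall in flagged cells.

```python
import numpy as np


def flag_top_cells(risk, coverage=0.05):
    """Flag the highest-risk cells of a 2D risk grid (hypothetical helper).

    `coverage` is the fraction of cells to flag; 0.05 mirrors the
    "top 5%" example.  At least one cell is always flagged.
    """
    n_flag = max(1, int(round(coverage * risk.size)))
    # Flat indices sorted by decreasing risk; ties are broken arbitrarily.
    order = np.argsort(risk, axis=None)[::-1][:n_flag]
    mask = np.zeros(risk.size, dtype=bool)
    mask[order] = True
    return mask.reshape(risk.shape)


def hit_rate(mask, counts):
    """Fraction of events which occurred in flagged cells.

    Returns NaN when there were no events at all in the period.
    """
    total = counts.sum()
    if total == 0:
        return np.nan
    return counts[mask].sum() / total


def evaluate(risks, counts_per_day, coverage=0.05):
    """Score one prediction per day, giving a time series of hit rates."""
    return np.array([
        hit_rate(flag_top_cells(risk, coverage), counts)
        for risk, counts in zip(risks, counts_per_day)
    ])


if __name__ == "__main__":
    # Synthetic data only: 30 days on a 20x20 grid, random "risk" and
    # Poisson event counts.  These are assumptions, not real crime data.
    rng = np.random.default_rng(0)
    risks = rng.random((30, 20, 20))
    counts = rng.poisson(0.05, size=(30, 20, 20))
    rates = evaluate(risks, counts)
    # One possible summary statistic for comparing prediction techniques.
    print("mean hit rate:", np.nanmean(rates))
```

Two prediction techniques could then be compared by running `evaluate` on each with the same event counts and comparing the resulting series, for instance via their means. Days with no events give NaN here, which is one choice among several; they could equally be dropped before comparison.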

Other properties of the prediction

TODO: Read [1] and [2]

References

(Not in alphabetical order, sorry, so that adding a new article doesn't require changing all the numbers above :smile:)

[1] Bowers et al., "Prospective hot-spotting: The future of crime mapping?", Brit. J. Criminol. 44 (2004) 641--658. DOI:10.1093/bjc/azh036
[2] Adepeju et al., "Novel evaluation metrics for sparse spatiotemporal point process hotspot predictions - a crime case study", International Journal of Geographical Information Science, 30:11, 2133--2154. DOI:10.1080/13658816.2016.1159684


